Implementation of decoders for LDPC block codes and LDPC convolutional codes based on GPUs



Abstract

In this paper, we propose efficient LDPC block-code decoders/simulators that run on graphics processing units (GPUs). We also implement a decoder for the LDPC convolutional code (LDPCCC). The LDPCCC is derived from a predesigned quasi-cyclic LDPC block code with good error performance. Compared with a decoder based on a randomly constructed LDPCCC, the proposed LDPCCC decoder has lower complexity, owing to the periodicity of the derived LDPCCC and the properties of the quasi-cyclic structure. In the proposed decoder architecture, Γ codewords (where Γ is a multiple of the warp size) are decoded together, and hence the messages of the Γ codewords are also processed together. Since all Γ codewords share the same Tanner graph, the messages of the Γ distinct codewords that correspond to the same edge can be grouped into one package and stored linearly. With the data structures of the messages used in the decoding process optimized in this way, the GPU can perform both reads and writes in a highly parallel manner. In addition, a thread hierarchy that minimizes thread divergence is deployed to maximize the efficiency of parallel execution. By using the large number of GPU cores to perform the simple computations simultaneously, our GPU-based LDPC decoder achieves a speedup of hundreds of times over a serial CPU-based simulator and of more than 40 times over an eight-thread CPU-based simulator.
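The grouping of per-edge messages described in the abstract can be illustrated with a small host-side sketch. This is not the authors' implementation: the function names (`message_index`, `pack_messages`, `edge_package`), the value `WARP_SIZE = 32`, and the edge-major offset `edge * gamma + codeword` are assumptions chosen for illustration. The sketch only shows how the Γ messages that belong to one Tanner-graph edge could end up adjacent in a flat buffer, so that consecutive threads would touch consecutive addresses.

```python
WARP_SIZE = 32  # assumption: warp size of current NVIDIA GPUs


def message_index(edge: int, codeword: int, gamma: int) -> int:
    """Flat offset of the message on `edge` for `codeword` (edge-major layout)."""
    return edge * gamma + codeword


def pack_messages(per_codeword, gamma):
    """Reorder per_codeword[c][e] into a flat list in which the gamma
    messages of each edge are stored next to each other."""
    num_edges = len(per_codeword[0])
    flat = [0.0] * (num_edges * gamma)
    for c in range(gamma):
        for e in range(num_edges):
            flat[message_index(e, c, gamma)] = per_codeword[c][e]
    return flat


def edge_package(flat, edge, gamma):
    """Return the contiguous package holding the gamma messages of one edge."""
    return flat[edge * gamma:(edge + 1) * gamma]


if __name__ == "__main__":
    gamma = 2 * WARP_SIZE  # gamma is a multiple of the warp size
    # Hypothetical messages: codeword c, edge e carries the value 100*e + c.
    msgs = [[100.0 * e + c for e in range(3)] for c in range(gamma)]
    flat = pack_messages(msgs, gamma)
    print(edge_package(flat, 1, gamma)[:4])
```

In a GPU kernel, the thread handling codeword `c` would read offset `edge * gamma + c`, so the threads of one warp would access one contiguous block of a package. That access pattern is the motivation for choosing Γ as a multiple of the warp size in this sketch.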

Bibliographic details

  • Authors: Zhao, Y; Lau, FCM
  • Year: 2014
  • Original format: PDF
  • Language: en
